The global Internet has become a mass medium on par with radio and television. Like radio and television content, content on the Internet is largely supported by advertising dollars. The main advertising-supported portion of the Internet is the “World Wide Web,” which displays HyperText Markup Language (HTML) documents distributed using the HyperText Transfer Protocol (HTTP).
Two of the most common types of advertisements on the World Wide Web portion of the Internet are banner advertisements and text link advertisements. Banner advertisements are generally images or animations that are displayed within an Internet web page. Text link advertisements are generally short segments of text that are linked to the advertiser's web site.
With any advertising-supported business model, there must be metrics for assigning monetary value to the advertising. Radio stations and television stations use ratings services that assess how many people are listening to a particular radio program or watching a particular television program in order to assign a monetary value to advertising on that particular program. Radio and television programs with more listeners or watchers are assigned larger monetary values for advertising. With Internet banner type advertisements, a similar metric may be used. For example, the metric may be the number of times that a particular Internet banner advertisement is displayed to people browsing various web sites. Each display of an Internet advertisement to a web viewer is known as an “impression.”
In contrast to traditional mass media, the Internet allows for interactivity between the media publisher and the media consumer. Thus, when an Internet advertisement is displayed to a web viewer, the advertisement may include a link that points to another web site where the web viewer may obtain additional information about the advertised product or service. A web viewer may ‘click’ on an Internet advertisement and be directed to the web site containing that additional information. When a web viewer selects an advertisement, this is known as a ‘click-through’ since the web viewer ‘clicks through’ the advertisement to see the advertiser's web site.
A click-through clearly has value to the advertiser since an interested web viewer has indicated a desire to see the advertiser's web site. Thus, an entity wishing to advertise on the Internet may wish to pay for such click-through events instead of paying for displayed Internet advertisements. Many Internet advertising services therefore offer Internet advertising wherein advertisers pay only for web viewers that click on the web-based advertisements. This type of advertising model is often referred to as the “pay-per-click” advertising model since the advertisers pay only when a web viewer clicks on an advertisement.
With such pay-per-click advertising models, Internet advertising services must display advertisements that are most likely to capture the interest of the web viewer to maximize the advertising fees that may be charged. In order to achieve this goal, it would be desirable to be able to select advertisements that most closely match the context in which the advertising is displayed. In other words, the selected advertisement should be relevant to the surrounding content. Thus, advertisements are often placed in contexts that match the product at a topical level. For example, an advertisement for running shoes may be placed on a sports news page. Simple information retrieval systems have been designed to capture such “relevance.” Examples of such information retrieval systems can be found in the book “Modern Information Retrieval” by Baeza-Yates, R. and Ribeiro-Neto, B. A., ACM Press/Addison-Wesley, 1999.
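As an illustration only, the kind of topical “relevance” described above can be sketched with a bag-of-words cosine similarity between a web page and each candidate advertisement. This is a generic information retrieval technique chosen for the example, not a method taken from any particular advertising service. The names `term_vector`, `cosine_similarity`, and `rank_ads`, and the sample page and advertisement texts, are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercase the text and count each word (a simple bag of words)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_ads(page_text, ads):
    """Return (score, ad_name) pairs, most topically similar advertisement first."""
    page_vec = term_vector(page_text)
    scored = [
        (cosine_similarity(page_vec, term_vector(ad_text)), name)
        for name, ad_text in ads.items()
    ]
    return sorted(scored, reverse=True)


# Hypothetical sample data: a sports news page and two candidate advertisements.
PAGE = "Marathon results: local runners set record times in the city running race"
ADS = {
    "running_shoes": "Lightweight running shoes for marathon runners",
    "mortgage": "Low rate home mortgage refinancing",
}

if __name__ == "__main__":
    for score, name in rank_ads(PAGE, ADS):
        print(f"{name}: {score:.3f}")
```

In this sketch the running shoe advertisement ranks first because it shares words with the page, while the mortgage advertisement shares none. The score depends entirely on vocabulary overlap.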
However, advertisements are not placed on the basis of topical relevance alone. For example, an advertisement for running shoes might be appropriate and effective on a web page comparing MP3 players since running shoes and MP3 players share a target audience, namely recreational runners. Thus, although MP3 players and running shoes are very different topics (and may share no common vocabulary), they are very closely linked on an advertising basis. Conversely, there may be advertisements that are topically very similar to a potential web page but cannot be placed in that web page because they are inappropriate. For example, it would be inappropriate to put an advertisement for a particular product in the web page of that product's direct competitor.
Furthermore, the language of advertising is rich and complex. For example, the phrase “I can't believe it's not butter!” implies at once that butter is the gold standard, and that this product is indistinguishable from butter. Understanding advertisements involves inference processes that can be quite sophisticated and are well beyond what traditional information retrieval systems are designed to cope with. Due to these difficulties, it would be desirable to have systems that extend beyond the simple concepts of relevance handled by existing information retrieval systems.